Abstract Body



Background / Context: 

US students are not performing strongly in math. Only 39% of fourth graders, 34% of eighth 
graders, and 26% of twelfth graders performed at or above proficiency on the National 
Assessment of Educational Progress (NAEP; NCES, 2009; NCES, 2010). Progress on successive NAEP 
tests has been meager and racial achievement gaps persist. Internationally, the 2009 average 
score of US students on the Program for International Student Assessment (PISA) was lower 
than the average score for OECD countries, despite small gains since the 2006 assessment 
(OECD, 2010). 

In the development of math skills over the course of primary and secondary education, the 
middle years are a key time when students’ math performance begins to lag (Lee and Fish, 
2010). Increasingly, educators and researchers are seeking out instructional methods that allow 
teachers to meet students’ individual needs (Davis, 2011). Students come to the classroom with 
varying levels of prior understanding, needing instruction in different skills, and with diverse 
interests and preferred learning styles. Sol is a new, individualized, technology-rich math 
program being implemented in three New York City middle schools. The program offers a high 
level of customization for each student, both in the content and material with which students 
engage, and in the teaching and learning modalities that are used to enhance students’ mastery of 
the material. Student, teacher, and parent surveys as well as ongoing assessment information 
inform how the program is tailored for each student. Sol is the recipient of a three-year, 
five-million-dollar Investing in Innovation (i3) development grant from the federal Department of 
Education, and was named one of the top fifty inventions of 2009 by Time magazine. 

The proposed paper evaluates the effectiveness of Sol in improving math test scores in its first 
year of complete implementation. While the study adds to previous research on Sol conducted 
by the Education Development Center for Children and Technology (CCT) and the New York 
City Department of Education (NYCDOE), our study is the first independent evaluation of this 
new, expanding program. Our study’s preliminary assessment of impact will contribute to 
future program development under the i3 grant as Sol continues to evolve. 

Purpose / Objective / Research Question / Focus of Study: 

• What is the impact of the initial whole-school version of Sol on students’ math achievement: 
1) as measured by performance on the New York State assessment in mathematics, and 2) as 
measured by performance on the interim, periodic mathematics assessments provided by 
Acuity? 

• To what extent does the impact of Sol differ across subgroups of students, including those 
defined by: 1) grade level; 2) prior mathematics achievement; 3) English language learning 
status; and 4) special education status? 

Setting: 

The New York City Department of Education (NYCDOE) includes over 1,500 schools serving more than 1.1 
million students. Sol is being implemented in three NYC middle schools, which we refer to as 
Manhattan School, Brooklyn School, and Bronx School. See Table 1 for descriptive 
statistics on these schools. 

Population / Participants / Subjects: 

Sol replaced the traditional math instruction and curriculum for sixth, seventh, and eighth graders 
in Manhattan School and Brooklyn School, and for sixth graders and some seventh graders in Bronx 
School. The program serves all students with the exception of newly arrived English Language 
Learners (ELLs) and students with special education needs requiring small, self-contained 
classrooms. In total, Sol serves approximately 1,700 students. 

Intervention / Program / Practice: 

Sol is a program of mass customization of student learning in response to the diverse levels of 
math proficiency and different preferred learning modalities students bring with them. The 
program develops a “playlist” of the math skills a particular student needs to work on based on 
an automated analysis of a variety of assessment data. Data from a number of surveys about the 
student’s interests and learning style form a student profile that will influence the kinds of 
lessons that are presented to the child. A lesson bank offers lessons to address each skill in a 
number of different formats: large group, small group, virtual tutoring, collaborative group 
activities, and educational software. A learning algorithm matches each student’s profile to the 
required instructional content to generate a schedule for all students and teachers every school 
day. Every day the students are assessed further, and these data feed back into the algorithm to 
improve it and to add more information to the student’s profile. The schedule the algorithm 
generates for each student each day is flexible, however, allowing teachers to adjust the schedule 
as necessary for their students. 

Before this year’s first full implementation of Sol, the program went through a number of pilot 
steps. In summer 2009, Sol provided a four-week summer school program for 80 rising 
seventh graders. In spring 2010, it was adapted as an after-school program for 240 sixth graders 
over seven weeks. Later that spring, Sol became the school-day math instructional program for 
200 sixth graders for six weeks. 

Research Design: 

We estimate the impact of Sol on math achievement using a comparative interrupted time series 
methodology, a method commonly used to evaluate school-wide programs.[1] The first stage of 
this method is to construct a baseline model representing the trend in math achievement in each 
grade at each school. We use data from 2006 through 2010 to construct this baseline trend. 
Extending this trend one further year gives an estimate of what the scores might have been if Sol 
had not been implemented. The difference between this predicted score and the actual 2011 score 
is a first estimate of the impact of Sol. However, we cannot necessarily attribute this deviation to 
Sol, since other district-wide reforms and policies may have come on line during this period and 
influenced math achievement independent of Sol. 
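
A minimal sketch of this first stage is given here, assuming a simple least-squares line through yearly mean scores for one grade at one school. The scores, the 2011 value, and the helper name `fit_linear_trend` are hypothetical and serve only to illustrate the projection and the resulting deviation; they are not study data.

```python
def fit_linear_trend(years, scores):
    """Ordinary least-squares line through (year, score) points."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical mean scale scores for one grade at one school (not real data).
baseline_years = [2006, 2007, 2008, 2009, 2010]
baseline_scores = [640.0, 648.0, 655.0, 663.0, 670.0]

# Stage 1: fit the 2006-2010 baseline and extend it one year.
intercept, slope = fit_linear_trend(baseline_years, baseline_scores)
predicted_2011 = intercept + slope * 2011

# Deviation of the (hypothetical) observed 2011 score from the projection.
observed_2011 = 681.0
deviation = observed_2011 - predicted_2011
```

With these illustrative numbers the fitted trend rises by 7.5 points per year, and the deviation is the quantity that Stage 1 alone would attribute to everything that changed in 2011, not only to Sol.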

The second stage of the analysis begins with the identification of a set of comparison schools. 
We use a combination of test-score trajectories from 2006 to 2010[2] and student demographics[3] to 
identify six comparison schools for each Sol school.[4] For potential comparison schools, we 
construct an index that measures the similarity to each Sol school and choose the six schools with 
the closest scores to form the comparison group.[5] For the comparison schools, we conduct the same 
interrupted time-series regression analysis that is described above in 
Stage 1 for the Sol schools. Since these schools were not exposed to Sol, deviations from the 
baseline trend would be due to other reforms or initiatives being implemented across the district or 
in selected middle schools like these. 

[1] This section draws heavily on Howard Bloom’s methodological work (1995, 1999, and 2003). 

[2] The test scores and trajectories are based on cross-section longitudinal regression analysis. For all eligible schools, 
we use student-level math test scores between 2006 and 2010 to estimate grade-specific intercepts in 2006 and 
year-to-year linear trends through 2010. 

[3] These characteristics are: the percent of students who are English language learners, the percent who are female, the percent who are 
Hispanic/Black, or Asian, the percent that have a special education designation (including only students that 
participated in the math test), the percent of students eligible for free and reduced-price lunch, and the school-wide 
attendance rate. 
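
The selection of comparison schools can be sketched as follows. This is only an illustration of a Euclidean-distance similarity index: the exact index used in the study is not specified here, so the z-score standardization, the four features, the school names, and the helper names `zscore_columns` and `closest_schools` are all assumptions. The example picks two schools from four candidates, whereas the study picks six.

```python
import math


def zscore_columns(rows):
    """Standardize each column so no single scale dominates the distance."""
    n = len(rows)
    n_features = len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(n_features)]
    sds = [
        math.sqrt(sum((r[k] - means[k]) ** 2 for r in rows) / n) or 1.0
        for k in range(n_features)
    ]
    return [[(r[k] - means[k]) / sds[k] for k in range(n_features)] for r in rows]


def closest_schools(target, candidates, k):
    """Return the names of the k candidates nearest to the target school."""
    names = list(candidates)
    rows = zscore_columns([target] + [candidates[name] for name in names])
    target_z, candidate_z = rows[0], rows[1:]
    distances = {
        name: math.dist(target_z, z) for name, z in zip(names, candidate_z)
    }
    return sorted(names, key=distances.get)[:k]


# Hypothetical features: [2006 intercept, yearly trend, % ELL, % poverty]
sol_school = [640.0, 7.5, 27.0, 75.0]
candidates = {
    "School A": [642.0, 7.4, 25.0, 73.0],
    "School B": [690.0, 3.0, 5.0, 30.0],
    "School C": [638.0, 7.8, 28.0, 77.0],
    "School D": [660.0, 6.0, 15.0, 55.0],
}
matches = closest_schools(sol_school, candidates, k=2)
```

Standardizing first is one common design choice: without it, a characteristic measured on a large scale (such as scale scores in the 600s) would outweigh percentages and trends in the distance.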

In the final stage of the analysis, we compare the differences estimated in Stage 1 with the 
differences estimated in Stage 2. Here, we estimate the difference between the test scores 
predicted by the baseline trend and the actual test scores in 2011 for the Sol and comparison 
schools simultaneously (combining Stages 1 and 2 in one model). Thus, this so-called 
“difference-in-difference” approach contrasts the gains in 2011 for the Sol schools with the gains for the 
comparison schools. This estimate represents the best indication of the impact Sol has on student 
math test scores over and above the influence of prior initiatives and trends and simultaneous 
interventions that may be underway across the district or in schools like those being served by 
Sol. 
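
The arithmetic of this contrast can be illustrated with the pooled baseline figures reported in Table 1 (estimated 2010 scores and estimated yearly change). The 2011 observed means below are hypothetical placeholders, since results are forthcoming, and the helper name `projected_score` is ours.

```python
def projected_score(estimated_2010, yearly_change):
    """Extend the linear baseline trend one year past 2010."""
    return estimated_2010 + yearly_change


# Baseline figures from Table 1 (pooled averages for each group).
sol_projected = projected_score(671.6, 7.5)         # about 679.1
comparison_projected = projected_score(676.4, 7.8)  # about 684.2

# Hypothetical 2011 observed means, for illustration only.
sol_observed_2011 = 684.0
comparison_observed_2011 = 686.0

# Deviation from the projected baseline in each group.
sol_deviation = sol_observed_2011 - sol_projected
comparison_deviation = comparison_observed_2011 - comparison_projected

# Difference-in-difference: Sol deviation net of the comparison deviation.
impact_estimate = sol_deviation - comparison_deviation
```

Under these made-up 2011 values the Sol schools exceed their projection by about 4.9 points and the comparison schools by about 1.8, so the contrast would credit roughly 3.1 points to Sol.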

In addition to the whole population of 6th, 7th, and 8th graders in the selected schools, we will also 
estimate the impact of Sol for different subgroups of students separately. The analytic strategy 
for these analyses is the same as the strategy described above, except that we will focus on 
discrete subgroups of students defined by prior performance levels and demographic 
characteristics. 

The minimum detectable effect size (MDES) for the full sample of three schools is between .24 
and .45. For a single grade, a single school, or the students of a subgroup (pooled 
across schools), the MDES is likely to range between .50 and .82. While these are 
moderate to large effect sizes, we nonetheless think it is entirely possible that we will detect an 
impact. For the pooled sample, scores would need to rise an average of 4 to 7.5 scale score 
points, which is roughly equivalent to the citywide growth New York City has witnessed over 
the past two to three years. Sol is a dramatically different mechanism for the delivery of math 
instruction and may have a large enough impact to be detected. 
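
The correspondence between effect sizes and scale score points can be checked with a short calculation. The sketch assumes an effect size is a score difference divided by the standard deviation of scores; the subgroup conversion further assumes that the same standard deviation applies, which is not stated in the text.

```python
def implied_sd(scale_points, effect_size):
    """Standard deviation implied by a (points, effect size) pair."""
    return scale_points / effect_size


def effect_size_to_points(effect_size, sd):
    """Convert an effect size back into scale score points."""
    return effect_size * sd


# Pooled sample: 4 to 7.5 points corresponds to an MDES of .24 to .45.
sd_from_lower_bound = implied_sd(4.0, 0.24)
sd_from_upper_bound = implied_sd(7.5, 0.45)

# Assumption: reuse that standard deviation for the subgroup MDES range.
subgroup_points_low = effect_size_to_points(0.50, sd_from_lower_bound)
subgroup_points_high = effect_size_to_points(0.82, sd_from_lower_bound)
```

Both ends of the pooled range imply the same standard deviation of about 16.7 scale score points, so the two ways of stating the MDES agree. Under the stated assumption, the subgroup range of .50 to .82 would correspond to roughly 8.3 to 13.7 points.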

Data Collection and Analysis: 

Our data sources include New York State math test scores from 2006 through 2011. In addition, 
we will use Acuity math predictive scores (a low-stakes, periodic assessment meant to inform 
instruction and widely used in schools across the city) from early 2011. We draw demographic 
information on school composition from the J Form, a publicly available dataset. 

Our analysis will estimate a linear baseline comparative interrupted time-series model: 

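$$
Y_{ijt} = INT + \beta_1 \cdot YR_{ijt} + \beta_2 \cdot Post2010_{ijt} + \beta_3 \cdot X_{ijt} + \beta_4 \cdot Sol_j + \beta_5 \cdot Sol_j \cdot YR_{ijt} + \beta_6 \cdot Sol_j \cdot Post2010_{ijt} + \varepsilon_{ijt}
$$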



[4] Potential comparison schools included the 189 New York City schools with a middle school grade configuration 
(Grades 6-8) that operated continuously between 2006 and 2011. 

[5] Based on the concept of “Euclidean distances” used in many cluster analyses, the similarity index captures the 
multi-dimensional differences between each Sol school and each potential comparison school based on important 
background and performance characteristics. 

Where: 

$Y_{ijt}$ = Test score for student i in school j in year t 

$YR_{ijt}$ = Year of observation for student i in school j, where -3, -2, -1, and 0 
correspond to 2006 - 2010, respectively and 1 corresponds to 2011 

$Post2010_{ijt}$ = 1 if observation for student i in school j is from 2011, 0 if 
observation is from 2006-2010 

$X_{ijt}$ = Vector of predictors of individual student characteristics for 
student i in school j in year t 

$Sol_j$ = 1 if school j is a Sol school, 0 otherwise 

$Sol_j \cdot YR_{ijt}$ = Year of observation for Sol school j, where -3, -2, -1, and 0 
correspond to 2006 - 2010, respectively and 1 corresponds to 2011 

$Sol_j \cdot Post2010_{ijt}$ = 1 if school j is a Sol school and the year is 2011, 0 otherwise 

$\varepsilon_{ijt}$ = random error for student i in school j in year t 


The parameter estimate $\beta_6$ (the coefficient on the interaction of the Sol indicator and the 2011 
indicator) represents the estimated difference in test scores between Sol schools 
and non-Sol schools in 2011, accounting for differences in trends between the groups 
during the period 2006-2010 and accounting for differences in the demographic characteristics in 
the two groups of schools. We will run this analysis three times, once for each Sol school and its 
comparison schools, and then pool these results to form one estimate of the program impact. 
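
To make the role of the interaction coefficient concrete, the sketch below fits the model by ordinary least squares on synthetic, noise-free data generated from known coefficients, then checks that the coefficient on the Sol-by-2011 term is recovered. Everything here is an assumption for illustration: the data are invented, a single English language learner indicator stands in for the vector of student characteristics, the year is coded as calendar year minus 2010, and the helpers `solve_linear_system`, `ols`, and `design_row` are ours. It does not address clustering of students within schools or the pooling of the three school-level estimates.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(design, outcome):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, outcome)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def design_row(year, ell, sol):
    """Columns: INT, YR, Post2010, X, Sol, Sol*YR, Sol*Post2010."""
    yr = year - 2010  # assumed coding: 0 in 2010, 1 in 2011
    post = 1 if year == 2011 else 0
    return [1, yr, post, ell, sol, sol * yr, sol * post]


# Synthetic data from known coefficients; the last one plays the role of beta_6.
true_beta = [670.0, 7.5, 1.0, -12.0, -3.0, 0.3, 4.0]
design, outcome = [], []
for sol in (0, 1):
    for year in range(2006, 2012):
        for ell in (0, 1):
            row = design_row(year, ell, sol)
            design.append(row)
            outcome.append(sum(b * x for b, x in zip(true_beta, row)))

beta = ols(design, outcome)
beta_6 = beta[6]  # coefficient on Sol * Post2010, the impact estimate
```

Because the synthetic outcomes contain no noise, the fitted coefficient on the interaction term matches the value used to generate the data (4.0 here), which shows that this term isolates the 2011 deviation of Sol schools net of the comparison schools' deviation and of both groups' baseline trends.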

Findings / Results/Conclusions: 

Findings and conclusions with regard to our research questions are forthcoming. Figure 1 shows 
two findings from our initial analysis that suggest our research design is particularly strong. First, 
the baseline trends for both the Sol schools and the comparison schools are highly predictive of 
the observed data, and thus will serve as a strong predictor of the 2011 scores. Second, the 
baseline trends for the Sol schools and the comparison schools show a high degree of similarity 
both in the levels of student performance and in the year-to-year growth. Because of this 
similarity, we may have a high degree of confidence that any differences that 
emerge between the schools in 2011 are likely to be due to the Sol schools being exposed to Sol and 
the comparison schools not being exposed. From these two findings, we conclude that our 
research design is a good choice for these data. Table 1 gives the precise numbers displayed in 
Figure 1, as well as some demographic data about the two groups. 

One potential limitation of our study stems from Sol’s recruitment process. The schools in 
which it was implemented were purposefully identified as being good candidates for successful 
implementation, so we cannot rule out the possibility that any effect we find could be due to 
selection bias. A second limitation is our limited statistical power, which yields an MDES between 
.24 and .45 for the pooled sample. 

Appendix B. Tables and Figures 



Table 1 

Characteristics of Sol Schools and their Comparison Schools 
Averages for 2006-2010 

Characteristic                                    Sol Schools   Comparison Schools
Female (%)                                               44.2                 48.6
Race/ethnicity (%)
  White                                                  10.0                 11.7
  Black                                                  17.0                 18.3
  Hispanic                                               32.6                 43.6
  Asian                                                  40.0                 25.7
English language learners (%)                            26.8                 16.1
Special education (%)                                     7.2                  6.0
Poverty[6] (%)                                           75.2                 68.4
Peer index                                                2.7                  2.7
Observed math scores in 2010                            669.5                672.5
Estimated math scores in 2010                           671.6                676.4
Estimated yearly change in math scores
  between 2006 and 2010                                   7.5                  7.8

[6] In 2009. 

Figure 1: 

Baseline Math Test Score Trend and Projection 
Grade 6, Brooklyn School 




This chart demonstrates how well suited the comparative interrupted time series methodology is to the present question. First, the 
predicted baseline models, both for the Sol schools and for the comparison schools, are quite close to the actual observed scores. 
Second, the models for the Sol schools and the comparison schools are quite close to one another. Taken together, these two findings 
support the assumptions on which the comparative interrupted time series analysis is based. 